Aggregating Research Papers from Publishers’ Systems to Support Text and Data Mining: Deliberate Lack of Interoperability or Not?
Abstract
In today’s technology-dominated world, interoperability of systems managed by different organisations is an essential property that enables services to be provided at a global scale. In the field of Text and Data Mining (TDM), interoperability of systems offering access to text corpora creates an opportunity to increase the uptake and impact of TDM applications. The global corpus of all research papers, a collection of human knowledge so large that no one could read it in a lifetime, represents one of the most exciting opportunities for TDM. Although the Open Access movement, which advocates free availability of research papers and the right to reuse them for TDM, has achieved some major successes on the legal front, the technical interoperability of systems offering free access to research papers remains a challenge. COnnecting REpositories (CORE) (Knoth and Zdrahal, 2012) aggregates the world’s open access full-text scientific manuscripts from repositories, journals and publisher systems. One of the main goals of CORE is to harmonise and pre-process these data to lower the barrier to TDM. In this paper, we report the preliminary results of an interoperability survey of systems provided by journal publishers, both open access and toll access. The survey helps us assess the current level of interoperability of these systems and suggest ways forward.
Publication date: 2016